NSF PAR Search | NSF Public Access Repository

RankMap: Priority-Aware Multi-DNN Manager for Heterogeneous Embedded Devices

https://doi.org/10.23919/DATE64628.2025.10992856

Karatzas, Andreas; Stamoulis, Dimitrios; Anagnostopoulos, Iraklis (March 2025, IEEE)

Free, publicly-accessible full text available March 31, 2026
Less is More: Optimizing Function Calling for LLM Execution on Edge Devices

https://doi.org/10.23919/DATE64628.2025.10992798

Paramanayakam, Varatheepan; Karatzas, Andreas; Anagnostopoulos, Iraklis; Stamoulis, Dimitrios (March 2025, IEEE)

Free, publicly-accessible full text available March 31, 2026
MapFormer: Attention-based multi-DNN manager for throughput & power co-optimization on embedded devices

https://doi.org/10.1145/3676536.3676724

Karatzas, Andreas; Anagnostopoulos, Iraklis (October 2024, ACM)

Full Text Available
LLM-dCache: Improving Tool-Augmented LLMs with GPT-Driven Localized Data Caching

https://doi.org/10.1109/ICECS61496.2024.10848749

Singh, Simranjit; Fore, Michael; Karatzas, Andreas; Lee, Chaehong; Jian, Yanan; Shangguan, Longfei; Yu, Fuxun; Anagnostopoulos, Iraklis; Stamoulis, Dimitrios (November 2024, IEEE)

Full Text Available
Balancing Throughput and Fair Execution of Multi-DNN Workloads on Heterogeneous Embedded Devices

https://doi.org/10.1109/TETC.2024.3407055

Karatzas, Andreas; Anagnostopoulos, Iraklis (January 2024, IEEE Transactions on Emerging Topics in Computing)

Full Text Available
Pythia: An Edge First Agent for State Prediction in High-Dimensional Environments

https://doi.org/10.1109/LES.2024.3403090

Karatzas, Andreas; Anagnostopoulos, Iraklis (January 2024, IEEE Embedded Systems Letters)

Modern deep learning agents usually operate in low-dimensional environments. They process pixel input, don’t offer insights into their thought process, and require significant power and computational resources. These characteristics make them inapplicable for embedded devices. In this letter, we present Pythia, an edge-first framework that uses latent imagination to handle complex environments efficiently and envision future agent states. It utilizes a VQ-VAE to reduce the high-dimensional features into a low-dimensional space, making it ideal for modern embedded devices. Moreover, Pythia offers human-interpretable feedback and scales well with respect to the design space. Pythia surpassed other state-of-the-art models in prediction accuracy on both intrinsic and extrinsic metrics.
Full Text Available
Hardware-Aware DNN Compression via Diverse Pruning and Mixed-Precision Quantization

https://doi.org/10.1109/TETC.2023.3346944

Balaskas, Konstantinos; Karatzas, Andreas; Sad, Christos; Siozios, Kostas; Anagnostopoulos, Iraklis; Zervakis, Georgios; Henkel, Jörg (January 2024, IEEE Transactions on Emerging Topics in Computing)

Deep Neural Networks (DNNs) have shown significant advantages in a wide variety of domains. However, DNNs are becoming computationally intensive and energy-hungry at an exponential pace, while at the same time, there is a vast demand for running sophisticated DNN-based services on resource-constrained embedded devices. In this paper, we target energy-efficient inference on embedded DNN accelerators. To that end, we propose an automated framework to compress DNNs in a hardware-aware manner by jointly employing pruning and quantization. We explore, for the first time, per-layer fine- and coarse-grained pruning, in the same DNN architecture, in addition to low bit-width mixed-precision quantization for weights and activations. Reinforcement Learning (RL) is used to explore the associated design space and identify the pruning-quantization configuration so that the energy consumption is minimized whilst the prediction accuracy loss is retained at acceptable levels. Using our novel composite RL agent, we are able to extract energy-efficient solutions without requiring retraining and/or fine-tuning. Our extensive experimental evaluation over widely used DNNs and the CIFAR-10/100 and ImageNet datasets demonstrates that our framework achieves 39% average energy reduction for 1.7% average accuracy loss and significantly outperforms state-of-the-art approaches.
Full Text Available
